(57) Abstract: The present invention is based on the sequencing and assembly of the human genome. The present invention provides
the primary nucleotide sequence of the coding portion of the human genome in the form of a series of transcript sequences with
accompanying exon information. This information can be used to generate nucleic acid detection reagents and kits such as nucleic
acid arrays, and for other uses.
LOCUS       CQ728483                 884 bp    DNA     linear   PAT 03-FEB-2004
DEFINITION  Sequence 14417 from Patent WO02068579.
ACCESSION   CQ728483
VERSION     CQ728483.1  GI:42297418
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Euarchontoglires; Primates; Catarrhini;
            Hominidae; Homo.
REFERENCE   1
  AUTHORS   Venter,C.J., Adams,M.C., Li,P.W. and Myers,E.W.
  TITLE     Kits, such as nucleic acid arrays, comprising a majority of
            human exons or transcripts, for detecting expression and other uses
            thereof
  JOURNAL   Patent: WO 02068579-A 14417 06-SEP-2002;
            PE Corporation (NY) (US)
FEATURES             Location/Qualifiers
     source          1..884
                     /organism="Homo sapiens"
                     /mol_type="unassigned DNA"
                     /db_xref="taxon:9606"
ORIGIN



  Query Match             9.4%;  Score 298;  DB 6;  Length 884;
  Best Local Similarity   100.0%;  Pred. No. 3e-159;
  Matches  298;  Conservative    0;  Mismatches    0;  Indels    0;  Gaps    0;



Qy        946 GAGGGCTATATCAAGTACTCTGCACTCTTCTATGGCTACTACAACAACCAGAGGACCATC 1005
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        187 GAGGGCTATATCAAGTACTCTGCACTCTTCTATGGCTACTACAACAACCAGAGGACCATC 246

Qy       1006 GGGTGGCTGAGGTACCGGCTGCCTATGGCTTACTTTATGGTGGGGGTCAGCGTGTTCGGC 1065
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        247 GGGTGGCTGAGGTACCGGCTGCCTATGGCTTACTTTATGGTGGGGGTCAGCGTGTTCGGC 306

Qy       1066 TACAGCCTGATTATTGTCATTCGATCGATGGCCAGCAATACCCAAGGAAGCACAGGCGAA 1125
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        307 TACAGCCTGATTATTGTCATTCGATCGATGGCCAGCAATACCCAAGGAAGCACAGGCGAA 366

Qy       1126 GGGGAGAGTGACAACTTCACATTCAGCTTCAAGATGTTCACCAGCTGGGACTACCTGATC 1185
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        367 GGGGAGAGTGACAACTTCACATTCAGCTTCAAGATGTTCACCAGCTGGGACTACCTGATC 426

Qy       1186 GGGAATTCAGAGACAGCTGATAACAAATATGCATCCATCACCACCAGCTTCAAGGAAT 1243
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        427 GGGAATTCAGAGACAGCTGATAACAAATATGCATCCATCACCACCAGCTTCAAGGAAT 484
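The coordinates of the nucleotide alignment above can be cross-checked against the reported totals. The following is a minimal sketch, not part of the search report: the `ROWS` list and the `check_rows` helper are hypothetical names, and the row data are transcribed by hand from the alignment. Because the report gives 100% identity with no gaps, each row carries a single base string shared by Qy and Db.

```python
# Rows transcribed from the Qy/Db alignment above:
# (query start, query end, db start, db end, aligned bases).
ROWS = [
    (946, 1005, 187, 246,
     "GAGGGCTATATCAAGTACTCTGCACTCTTCTATGGCTACTACAACAACCAGAGGACCATC"),
    (1006, 1065, 247, 306,
     "GGGTGGCTGAGGTACCGGCTGCCTATGGCTTACTTTATGGTGGGGGTCAGCGTGTTCGGC"),
    (1066, 1125, 307, 366,
     "TACAGCCTGATTATTGTCATTCGATCGATGGCCAGCAATACCCAAGGAAGCACAGGCGAA"),
    (1126, 1185, 367, 426,
     "GGGGAGAGTGACAACTTCACATTCAGCTTCAAGATGTTCACCAGCTGGGACTACCTGATC"),
    (1186, 1243, 427, 484,
     "GGGAATTCAGAGACAGCTGATAACAAATATGCATCCATCACCACCAGCTTCAAGGAAT"),
]


def check_rows(rows):
    """Verify that each row's coordinates match its length and that
    consecutive rows are contiguous; return the total aligned length."""
    total = 0
    prev_q_end = prev_d_end = None
    for q_start, q_end, d_start, d_end, seq in rows:
        # With no gaps, each coordinate span equals the number of bases.
        assert q_end - q_start + 1 == len(seq)
        assert d_end - d_start + 1 == len(seq)
        if prev_q_end is not None:
            # The next row must start right after the previous one.
            assert q_start == prev_q_end + 1
            assert d_start == prev_d_end + 1
        prev_q_end, prev_d_end = q_end, d_end
        total += len(seq)
    return total


if __name__ == "__main__":
    print("aligned positions:", check_rows(ROWS))
```

The returned total can be compared with the "Matches" figure in the score summary for this hit.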



Alignment Scores:
Pred. No.:               4.1e-81
Score:                   99.00
Percent Similarity:      100.00%
Best Local Similarity:   100.00%
Query Match:             10.93%
DB:                      6
Length:                  884
Matches:                 99
Conservative:            0
Mismatches:              0
Indels:                  0
Gaps:                    0



US-10-792-307-4 (1-906) x CQ728483 (1-884) 
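The reported Query Match figure is consistent with the score divided by the query length (99 over 906 residues). That definition is an assumption made for this sketch, since the report does not state how Query Match is computed; `query_match_percent` is a hypothetical helper name.

```python
def query_match_percent(score: float, query_length: int) -> float:
    """Score as a percentage of the query length, rounded to two decimals.

    Assumption: one score point per matched residue, so the maximum
    attainable score equals the query length.
    """
    return round(100.0 * score / query_length, 2)


if __name__ == "__main__":
    # Values taken from this hit: Score 99.00, query US-10-792-307-4 (1-906).
    print(query_match_percent(99.0, 906))
```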



Qy        311 GluGlyTyrIleLysTyrSerAlaLeuPheTyrGlyTyrTyrAsnAsnGlnArgThrIle 330
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        187 GAGGGCTATATCAAGTACTCTGCACTCTTCTATGGCTACTACAACAACCAGAGGACCATC 246

Qy        331 GlyTrpLeuArgTyrArgLeuProMetAlaTyrPheMetValGlyValSerValPheGly 350
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        247 GGGTGGCTGAGGTACCGGCTGCCTATGGCTTACTTTATGGTGGGGGTCAGCGTGTTCGGC 306

Qy        351 TyrSerLeuIleIleValIleArgSerMetAlaSerAsnThrGlnGlySerThrGlyGlu 370
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        307 TACAGCCTGATTATTGTCATTCGATCGATGGCCAGCAATACCCAAGGAAGCACAGGCGAA 366

Qy        371 GlyGluSerAspAsnPheThrPheSerPheLysMetPheThrSerTrpAspTyrLeuIle 390
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        367 GGGGAGAGTGACAACTTCACATTCAGCTTCAAGATGTTCACCAGCTGGGACTACCTGATC 426

Qy        391 GlyAsnSerGluThrAlaAspAsnLysTyrAlaSerIleThrThrSerPheLysGlu 409
              |||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        427 GGGAATTCAGAGACAGCTGATAACAAATATGCATCCATCACCACCAGCTTCAAGGAA 483
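The translated alignment above pairs each query residue with one database codon, which can be checked by translating the Db rows with the standard genetic code. The sketch below is an illustration only and does not reproduce how the search tool works; `CODON_TABLE`, `THREE_LETTER`, `translate3` and `PAIRS` are hypothetical names, and the sequences are transcribed by hand from the alignment.

```python
# Standard genetic code, built from the usual TCAG-ordered 64-letter string.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# One-letter to three-letter residue codes, as used in the Qy rows.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "***",
}


def translate3(dna: str) -> str:
    """Translate DNA in frame 1 into concatenated three-letter codes.
    Trailing bases that do not fill a codon are ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(
        THREE_LETTER[CODON_TABLE[dna[i:i + 3]]] for i in range(0, usable, 3)
    )


# (Db row, Qy row) pairs transcribed from the alignment above.
PAIRS = [
    ("GAGGGCTATATCAAGTACTCTGCACTCTTCTATGGCTACTACAACAACCAGAGGACCATC",
     "GluGlyTyrIleLysTyrSerAlaLeuPheTyrGlyTyrTyrAsnAsnGlnArgThrIle"),
    ("GGGTGGCTGAGGTACCGGCTGCCTATGGCTTACTTTATGGTGGGGGTCAGCGTGTTCGGC",
     "GlyTrpLeuArgTyrArgLeuProMetAlaTyrPheMetValGlyValSerValPheGly"),
    ("TACAGCCTGATTATTGTCATTCGATCGATGGCCAGCAATACCCAAGGAAGCACAGGCGAA",
     "TyrSerLeuIleIleValIleArgSerMetAlaSerAsnThrGlnGlySerThrGlyGlu"),
    ("GGGGAGAGTGACAACTTCACATTCAGCTTCAAGATGTTCACCAGCTGGGACTACCTGATC",
     "GlyGluSerAspAsnPheThrPheSerPheLysMetPheThrSerTrpAspTyrLeuIle"),
    ("GGGAATTCAGAGACAGCTGATAACAAATATGCATCCATCACCACCAGCTTCAAGGAA",
     "GlyAsnSerGluThrAlaAspAsnLysTyrAlaSerIleThrThrSerPheLysGlu"),
]

if __name__ == "__main__":
    for dna, protein in PAIRS:
        print(translate3(dna) == protein, len(dna) // 3)
```

Each printed line reports whether a Db row translates to its Qy row, together with the number of codons in that row.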



